Excel BI - Excel Challenge 786

excel-challenges

excel-formulas

🔰 Find the sum of various items where number may be before or after the item names.

Published

March 24, 2026

Illustration for Excel BI - Excel Challenge 786

Challenge Description

🔰 Find the sum of various items where number may be before or after the item names.

Solutions

library(tidyverse)
library(readxl)

path = "Excel/700-799/786/786 Sum of Items.xlsx"
input = read_excel(path, range = "A2:A9")
test  = read_excel(path, range = "B2:C7")

result = input %>%
  separate_longer_delim(col = "Data", delim = " / ") %>%
  mutate(Quantity = str_extract(Data, "\\d+") %>% trimws(),
         Item = str_extract(Data, "\\D+") %>% trimws()) %>%
  mutate(Items = ifelse(str_detect(Item, "s$"), str_replace(Item, "s$", ""), Item)) %>%
  summarise(Total = sum(as.numeric(Quantity), na.rm = TRUE), .by = Items)

all.equal(result, test)
# > [1] TRUE

Logic: Read the workbook ranges needed for the challenge; Derive the required intermediate columns; Parse the packed text or string structure; Aggregate or rank the data at the required grouping level.
Strengths: The solution stays close to the text pattern itself, which makes the extraction logic easy to audit.
Areas for Improvement: The solution assumes the workbook layout and selected ranges remain stable, so any structural change in the sheet would require small adjustments.
Gem: A small number of well-targeted text patterns does most of the heavy lifting.

import pandas as pd
import re

path = "700-799/786/786 Sum of Items.xlsx"
input = pd.read_excel(path, usecols="A", skiprows=1, nrows=8)
test = pd.read_excel(path, usecols="B:C", skiprows=1, nrows=5).sort_values(by='Items').reset_index(drop=True)

input_long = input['Data'].dropna().str.split(' / ').explode().reset_index(drop=True)
df = input_long.str.extract(
    r'(?:(?P<Quantity>\d+)\s*(?P<Item>[A-Za-z]+)|(?P<Item2>[A-Za-z]+)\s*(?P<Quantity2>\d+))'
)
df['Item'] = df['Item'].combine_first(df['Item2'])
df['Quantity'] = df['Quantity'].combine_first(df['Quantity2'])
df = df[['Quantity', 'Item']]
df['Items'] = df['Item'].str.rstrip('s').fillna('')
df['Quantity'] = pd.to_numeric(df['Quantity'], errors='coerce')
result = (
    df.groupby('Items', as_index=False)['Quantity']
    .sum()
    .rename(columns={'Quantity': 'Total'})
    .sort_values(by='Items')
    .reset_index(drop=True)
)

print(result.equals(test)) # True

The Python version follows the same grouped logic and keeps the transformation explicit in a dataframe pipeline.

Difficulty Level

Medium

The individual steps are manageable, but the correct transformation pattern is not obvious from the raw data.